Protein folding as a jamming transition

Proteins fold to a specific functional conformation with a densely packed hydrophobic core that controls their stability. We develop a geometric, yet all-atom model for proteins that explains the universal core packing fraction of ϕc=0.55 found in experimental measurements. We show that as the hydrophobic interactions increase relative to the temperature, a novel jamming transition occurs when the core packing fraction exceeds ϕc. The model also recapitulates the global structure of proteins since it can accurately refold to native-like structures from partially unfolded states.

While the fact that each protein folds to a specific conformation suggests an equilibrium process, dense packing in protein cores suggests that non-equilibrium processes also occur.For example, as hard particles are compressed, the system becomes rigid and solid-like, i.e. jammed, at a sufficiently large packing fraction φ c [13].However, the φ c and mechanical properties depend on the protocol used to generate the particle packings, and thus jamming is a highly non-equilibrium process [14][15][16][17].In addition, vibrational studies of proteins show that they posses a boson peak, or an abundance of low frequency modes in the density of states, which is a prominent feature of non-equilibrium systems, such as glasses [18][19][20][21].Moreover, recent experiments on the dry molten globule state [22][23][24][25] suggest that the final stages of core formation take much longer than the initial stages of folding [26].Thus, it is important to develop a geomertic, yet atomistically accurate model for proteins, which will allow us to rigorously connect the nonequilibrium physics of hard-particle packings [13,27] to the nearly folded conformational landscape of proteins [28].
In this Letter, we first discuss a hard-sphere (HS) model for proteins with stereochemical constraints and a specific set of atom sizes that recapitulates the allowed backbone and side chain dihedral angle distributions in proteins.We then add attractive atomic interactions that scale with amino acid hydrophobicity (i.e. the HS+HP model) to explore core formation.We find that the HS+HP model collapses as the attractive strength relative to temperature is increased, and similar to jamming, transitions from a floppy to rigid state at φ c ∼ 0.55.Additionally, we find that the potential energy of atomic overlaps scales as a power-law with packing fraction, ⟨V r ⟩ ∼ (φ − φ c ) δ with a novel scaling exponent δ ∼ 9/2.This result suggests that proteins collapse until the core amino acids reach a mechanically stable state that resists the compression induced by the hydrophobic attractions.Moreover, HS+HP model proteins can refold from partially unfolded states, suggesting that the model can recapitulate the protein conformational landscape.
First, to calculate the core packing fraction, the set of atomic diameters {σ i } must be defined.However, the literature provides a wide range of possible {σ i } for the hard-core atomic interactions in proteins [12].Therefore, we propose that {σ i } can be selected by validating the atom sizes against a fundamental feature of protein structure.Ramachandran, et al. first demonstrated that by assuming only repulsive, hardcore atomic interactions, plus the stereochemistry of amino acids, one can predict the backbone dihedral angle pairs ϕ and ψ that occur in proteins are those pairs that do not cause large atomic overlaps [29,30].(See the inset of Fig. 1(c) for the backbone dihedral angle distribution from high-quality x-ray crystal structures of proteins.)We have also validated this approach for the distributions of side chain dihedral angles [31][32][33][34][35].The potential energy for the HS model includes both nonbonded and bonded atomic interactions.For the nonbonded FIG. 1.The nonbonded dimensionless pair potential Ṽi j = V i j /ϵ r plotted versus atomic separation ri j = r i j /σ i j for purely-repulsive interactions (Eq. 1) (solid line) and attractive interactions (Eq.5) (dashed line) and (b) the corresponding dimensionless force Fi j = F i j σ i j /ϵ r .The symbols represent the onset of repulsive interactions where ri j = 1 and Ṽi j = − Ṽc = −V c /ϵ r (circles), the change in spring constant where ri j = rβ = 1 + r β σ i j and Fi j = − βi j = −β λ i j σ i j /ϵ r (squares), and the separation above which the interactions are zero ri j = rα = 1 + r α /σ i j = 1 + α (triangles).(c) The difference ∆ f between the average fraction of backbone dihedral angle outliers (black circles) and side chain dihedral angle outliers (grey squares) between the HS model and proteins from a high-quality x-ray crystal structure database plotted versus the temperature T /ϵ r at which the HS model proteins were simulated.Inset: The probability distribution of backbone dihedral angles P(ϕ, ψ) sampled by high-quality x-ray crystal structures of proteins.The colors from light to dark indicate increasing probability on a logarithmic scale.(d) Probability distribution of the average core packing fraction P(⟨φ ⟩) in high-quality x-ray crystal structures of proteins calculated using the optimized HS atom sizes on a semi-log plot with a Gaussian fit (black dashed line).
interactions, we employ a purely repulsive linear spring potential to prevent atomic overlaps, where ϵ r defines the repulsive energy scale, r i j is the centerto-center distance between atoms i and j, σ i j is their average diameter, and Θ(x) is the Heaviside step-function.(See Fig. 1 (a) and (b).)The total repulsive potential energy includes all atom pairs except those that participate in bonded interactions: V r = ∑ ⟨i, j⟩ ′ V r (r i j ), where ⟨i, j⟩ ′ is the set of nonbonded atom pairs.We add restraints on the bond lengths r i j , bond angles θ i jk , and dihedral angles ω i jkl with rest values, r 0 i j , θ 0 i jk , and ω 0 i jkl that occur in each target protein's high-resolution x-ray crystal structure: where k b = k a = k d = ϵ r are the respective spring constants and σ H is the diameter of hydrogen.(Below, all energy scales will be given in units of ϵ r .)We set the spring constants to be equal to weight nonbonded overlaps and deformations in stereochemistry equally.Additionally, only the dihedral angles ω i jkl needed to maintain high-quality protein stereochemistry are restrained.First, we add restraints to the main chain peptide bond dihedral angle ω i jkl , which due to the peptide bond's partial double-bonded character, is relatively planar in high-quality protein structures.Second, amino acids with side chains containing double bonds require restraints to maintain their planar geometry, such as in the phenylalanine ring.The total potential energy for the HS model is then , ⟨i, j⟩ is the set of bonded atom pairs, ⟨i, j, k⟩ is the set of atom triples that defines each bond angle, and ⟨i, j, k, l⟩ is the set of groups of four atoms that define the dihedral angles.All hydrogens are placed using the REDUCE software [36].When comparing simulation results to experimentally determined protein structures, we use a high-resolution dataset of ∼ 5, 000 structures with a resolution < 1.8 Å culled from the Protein Data Bank (PDB) [37,38].For the HS protein simulations, we carry out Langevin dynamics over range of temperatures 10 −8 < T /ϵ r < 10 −2 using 20 randomly selected, single chain target proteins with no disulfide bonds from the x-ray crystal structure dataset.Sizes range from N aa = 60-335 with an average of ⟨N aa ⟩ = 150 amino acids.(PDBIDs are given in Table S1 and examples of the restraints in Eqs.2-4 are given in Tables S2 and S3 in Supplemental Material (SM).) Using optimized {σ i } (Table S4 of SM) [31][32][33][34][35]39], we compare the backbone and side chain dihedral angles sampled by the HS model and those of high-quality x-ray crystal structures.We use the software package MOLPROBITY to quantify the fraction f of backbone and side chain dihedral angle outliers, with respect to a reference set of high quality x-ray crystal structures [40][41][42].We compare the fraction of backbone and side chain outliers in the HS simulations f s (T /ϵ r ) to the fraction of outliers in our high-resolution xray crystal structure database f x .We show in Fig. 1 (c) that ∆ f (T /ϵ r ) = f s (T /ϵ r ) − f x approaches zero for both backbone and side chain dihedral angles as T /ϵ r decreases (and the HS model approaches the hard-core limit).Note that the HS protein model recapitulates the Ramachandran map even though it has fewer restraints than in typical all-atom protein force fields.For example, in an alanine dipeptide, the HS model includes two dihedral angle restraints, whereas current Amber and CHARMM force fields have 41 dihedral angle restraints [43,44].
With this optimized set of atomic diameters {σ i }, we can calculate the average core packing fraction ⟨φ ⟩ in the highresolution x-ray crystal structure data set as shown in Fig. 1  (d).Core amino acids are those that have relative solvent accessible surface area rSASA < 10 −3 , using the Lee and Richards algorithm with a probe size of a water molecule [45].As we have previously reported [12,46], we find ⟨φ ⟩ ∼ 0.55 ± 0.01.The same result is found for solution NMR structures when only including high quality bundles [47].An important question naturally arises, why does the folding process give rise to this value for ⟨φ ⟩ in all globular protein cores?
To study core formation, we can add attractive interactions to the HS protein model, which yields the HS+HP model.For the nonbonded attractive interactions between atoms, we extend the potential in Eq. 1 to r β /σ i j = 1 + σ i j β i j /σ H and cutoff the interactions at r α /σ i j = 1 + α > r β using piecewise harmonic functions of r i j : where continuity.α defines the attractive range and β i j = β λ i j defines the magnitude of the attractive force.(See Fig. 1 (a) and (b).) λ i j = (λ i + λ j )/2 is the average hydrophobicity associated with atom pairs i and j, where 1 ≤ λ i ≤ 0 is the hydropho-bicity per amino acid and is assigned to each atom on a given amino acid [48].(See Table S5 in SM.) To explore the dynamics of folding for the HS+HP model, we run Langevin dynamics with the HS-energy minimized xray crystal structure of a given protein as the initial condition.We consider 20 randomly selected single-chain protein targets from the high-resolution x-ray crystal structure database and study the folowing parameter regimes: 0.5 ≤ α ≤ 2, 10 −12 ≤ β ≤ 10 −3 , and 10 −8 ≤ T /ϵ r ≤ 10 −6 .In Fig. 2 (a), we show the packing fraction of core residues ⟨φ ⟩ averaged over the 20 proteins versus increasing attractive strength, quantified using α 2 β .Plotting ⟨φ ⟩ versus α 2 β collapses the data for each temperature T /ϵ r .At small α 2 β , the proteins unfold and ⟨φ ⟩ < 0.55.As the attractive interactions increase, a plateau at ⟨φ ⟩ ∼ 0.55 (i.e. at the average packing fraction of experimentally determined protein cores) occurs for α 2 β ∼ T /ϵ r .Increasing the attraction further causes a steep increase in ⟨φ ⟩.As T /ϵ r is lowered, the HS+HP model behaves as a hard-core system and the plateau extends to smaller α 2 β .⟨φ ⟩ versus α 2 β is well fit by where A and B are constants, φ c → 0.55 and the exponents a → 1/3 and b → 2 as T /ϵ r → 0. How are such large values of ⟨φ ⟩ > 0.55 possible in Fig. 2 (a)?When we plot the average total nonbonded repulsive potential energy per atom ⟨V r /N⟩ versus α 2 β in Fig. 2 (b), we find that ⟨V r /N⟩ ∼ V 0 , where V 0 ∼ T /ϵ r for α 2 β < T /ϵ r .However, when α 2 β > T /ϵ r , ⟨V r /N⟩ increases from the plateau value V 0 as a power-law: where C is a constant and c → 3/2 as T /ϵ r → 0. Thus, we find that when ⟨φ ⟩ > 0.55, the total repulsive energy per atom increases strongly, which indicates a jamming transition.
where ∆V r = ⟨V r /N⟩ − V 0 , A = A/C a/c and B = B/C −b/c .When ⟨φ ⟩ − φ c = ⟨∆φ ⟩ ≫ 0, Eq. 8 simplifies to ⟨V r /N⟩ ∼ ⟨∆φ ⟩ δ , where δ = c/a → 9/2 in the T /ϵ r → 0 limit.This result is similar to that found for the jamming transition in particle packings, except with a significantly larger exponent than δ = 2 expected from affine compression.Thus, Fig. 2 shows that the HS+HP model undergoes a jamming transition when the average packing fraction increases above the value observed in x-ray crystal structures of proteins.However, the jamming transition in the HS+HP model has a scaling exponent δ that is more than a factor of two larger than that found previously for hard-sphere systems and bead-spring polymers [49].In the SM, we confirm that the collapse transition in bead-spring polymers with the same nonbonded interactions and only bond-length constraints yields δ = 2, which suggests that the anomalous exponent for the HS+HP model is caused by the unique geometry of amino acids and not from the attractive interactions.A possible source of the anomalous scaling exponent is changes in the number of contacts between amino acids as the core is compressed [50].
Below jamming onset, unjammed systems possess a large number of low frequency, liquid-like modes in the vibrational density of states (VDOS).Near jamming onset, excess intermediate frequencies, known as the boson peak, occur in the VDOS, and as the packing fraction increases above jamming onset the boson peak is suppressed [51,52].We calculate the VDOS from the eigenvalues e n of the displacement correlation matrix S = VC −1 , where V i j = ⟨v i v j ⟩ is the velocity correlation matrix and C i j = ⟨(r i − r 0 i )(r j − r 0 j )⟩ is the positional covariance matrix, v i are the atom velocities, r i are the atom positions, and r 0 i are the average atom positions.The angle brackets indicate time averages.Each eigenvalue e n has a corresponding eigenvector ên = {e 1xn , e 1yn , e 1zn , . . ., e Nxn , e Nyn , e Nzn }.The VDOS D(ω n ) is then obtained by binning the frequencies ω n = √ e n [53,54], where the frequencies are given in units of ϵ r / m H σ 2 H , where m H is the mass of hydrogen.
In Fig. 3 (a), to investigate the rigidification of the HS+HP model, we plot the VDOS of the backbone C α atoms averaged over the 20 proteins for T /ϵ r = 10 −8 .We show D(ω n ) for all α and β values in Fig. 2 and as a function of ⟨V r /N⟩ to identify the jamming transition.When the HS+HP proteins are unjammed with ⟨V r /N⟩ ∼ 10 −10 , the VDOS possesses a large peak of liquid-like modes in the range 10 −2 < ω n < 10 −1 , as well as a secondary peak near ω n ∼ 1 corresponding to the bonded interactions.As ⟨V r /N⟩ increases, the liquid-like peak decreases and the modes at intermediate frequencies fill-in to form a plateau near D(ω n ) ∼ 1.A key sign of the onset of rigidification is the formation of a plateau in the intermediate frequency region of the VDOS, also known as the boson peak [55].The boson peak is suppressed when the system becomes overcompressed with increasing ⟨V r /N⟩.(The results for the VDOS of attractive bead-spring polymers are similar, except the liquid-like modes vanish more rapidly with increasing ⟨φ ⟩ related to the difference in δ , as shown in SM.) Near jamming onset in packings of spherical particles, the vibrational modes in the VDOS plateau region are quasilocalized, i.e. many particles participate in the eigenmodes, but they are not phonon-like [55].We investigate the localization of the modes in the platueau region of D(ω n ) by calculating the participation ratio for each eigenmode, where ⃗ d i (ω n ) = {e ixn , e iyn , e izn } is the contribution of particle i to the nth eigenvector of S [56].In Fig. 3 (b), we plot the binned p r (ω n ) at each ⟨V r /N⟩.A key difference between p r (ω n ) for the unjammed systems (⟨V r /N⟩ ∼ 10 −10 ) and jammed systems is that there is a strong increase in p r (ω n ) for frequencies in the range 10 −1 < ω n < 10 0 , which indicates the development of quasi-localized modes at intermediate frequencies.As expected, the highest frequencies correspond to local excitations.
We demonstrated that during folding, the HS+HP model for proteins undergoes a jamming transition at the average core packing fraction observed in high-resolution x-ray crystal structures.We now quantify whether the backbone atoms of the HS+HP model deviate from the x-ray crystal structures during the jamming process.To do this, we calculate the rootmean-square-deviations (RMSD) in the C α positions between the simulated and experimental protein structures, where⃗ r ms and⃗ r me are the C α positions of the mth amino acid from the simulations and x-ray crystal structures, respectively.We find that ∆ converges rapidly as a function of time, and thus we focus on ∆ f at the last time point in the simulations.We plot ⟨∆ f ⟩ averaged over the 20 proteins in Fig. 4 (a) for the HS+HP simulations presented above.We find that ⟨∆ f ⟩ ∼ 1 Å near jamming onset, confirming that not only the core packing fraction, but also the overall backbone conformation is similar to the x-ray crystal structure near jamming onset.Does the C α RMSD of the HS+HP model relative to the x-ray crystal structures remain small when the simulations are initialized further from the x-ray crystal structure?To study the ability of the HS+HP model to refold proteins, we initialize the HS+HP simulations with conformations at different values of the C α RMSD ∆ i using the HS model conformations, which unfold over time since there are no attractive forces.We then run Langevin dynamics simulations of the HS+HP model at T /ϵ r = 10 −7 over the range 0.5 ≤ α ≤ 5.5 and we set β such that α 2 β ∼ T /ϵ r .
In Fig. 4 (b), we plot the long-time C α RMSD ⟨∆ f ⟩ versus ∆ i for a range of α averaged over all 20 proteins studied.We find that for short attractive ranges (i.e.α ≲ 0.5), while when starting in the crystal structure can lead to a jamming transition, the HS+HP model cannot refold (i.e.⟨∆ f ⟩ ∼ ∆ i ) above ∆ i ∼ 2 Å.As α is increased, the HS+HP model can refold initial states with ∆ i ≲ 5 Å to ⟨∆ f ⟩ ∼ 2 Å, a threshold that is considered properly folded in all-atom MD simulations of protein folding [57].In addition, all HS+HP proteins that refold to form a well-defined core possess ⟨φ ⟩ ∼ 0.55.We also compared these results to those from all-atom MD simulations using the Amber99SB-ILDN force field in explicit water.We find that ⟨∆ f ⟩ ∼ 2 Å when starting near the x-ray crystal structure for PDBID: 2IGP, yet ⟨∆ f ⟩ ∼ ∆ i when ∆ i > 2 Å after running for > 1 µs [43,[58][59][60][61][62].(More details are found in SM.) Here, we have developed a quantitatively accurate model for protein structure in which the stereochemistry of the amino acids is preserved and the atom sizes are optimized to recapitulate the experimentally observed backbone (and side chain) dihedral angle distributions.By adding hydrophobic attractive interactions, we showed that a novel jamming transition occurs during folding at the average core packing fraction of protein x-ray crystal structures.We showed that the total repulsive potential energy versus ⟨∆φ ⟩ obeys power-law scaling above jamming onset with an anomalous exponent δ > 2. In addition, the vibrational response indicates that the HS+HP model rigidifies at φ c = 0.55 with quasi-localized vibrational modes at intermediate frequencies.Thus, we have demonstrated that the core packing fraction observed in high quality experimental protein structures is due to the onset of jamming under hydrophobic compression and have provided a theoretical direction for understanding non-equilibrium properties of proteins.In addition, starting from partially unfolded states with ∆ i ≲ 5 Å, HS+HP proteins can refold to the x-ray crystal structure.We believe that the HS+HP model is well-suited for tackling many open problems in protein science, such as predicting the structural response to amino acid mutations, identifying protein-protein interactions, and understanding protein structure in vivo [63][64][65][66][67].In addition, the HS+HP model can be used to investigate the effects of folding rate on protein core packing, given that the properties of other jammed systems possess strong cooling rate dependence [68][69][70][71].

FIG. 2 .
FIG. 2. (a) The average core packing fraction ⟨φ ⟩ plotted versus the attraction strength α 2 β for the HS+HP protein model for temperatures T /ϵ r = 10 −6 (yellow), 10 −7 (green), and 10 −8 (blue) and α = 0.5 (circles), 1.0 (squares), 1.5 (upward triangles), and 2.0 (downward triangles).The horizontal red dot-dashed line and cyan shading indicate the average and standard deviation of the core packing fraction in the highresolution x-ray crystal structure data set.The black dashed lines indicate fits to Eq. 6.(b) The average repulsive potential energy per atom ⟨V r /N⟩ plotted versus α 2 β .The black dashed lines indicate fits to Eq. 7. (c) ⟨V r /N⟩ plotted versus ⟨φ ⟩.The vertical red dot-dashed line and cyan shading indicate the average and standard deviation of the core packing fraction in the high-resolution x-ray crystal structure data set.The black dashed lines indicate fits to Eq. 8.